ABSTRACT

We consider methods with which to answer the question "is any observed galaxy cluster 
too unusual for ΛCDM?" After emphasising that many previous attempts to answer
this question have fallen foul of a statistical bias which causes them to overestimate the
confidence levels to which ΛCDM can be ruled out, we outline a consistent approach
to these rare clusters which allows the question to be answered. We explicitly separate 
the two procedures of first ranking clusters according to which appears 'most unusual' 
and secondly calculating the probability that such an unusual observation was made in 
a given cosmology. For the ranking procedure we define three properties of individual 
galaxy clusters, each of which is sensitive to changes in cluster populations arising
from different modifications to the cosmological model. We use these properties to 
define the "equivalent mass at redshift zero" for a cluster - the mass of an equally 
unusual cluster today. This quantity is independent of the observational survey in 
which the cluster was found, which makes it an ideal proxy for ranking the relative 
unusualness of clusters detected by different surveys. We then calculate the probability 
that any cluster could have been observed with this equivalent mass at redshift zero, 
avoiding the a posteriori bias present in many earlier analyses. These two steps are 
performed for a systematic and comprehensive sample of observed galaxy clusters 
and we confirm that none are more than 1σ deviations from the ΛCDM expectation.
Whereas we have only applied our method to galaxy clusters, it is applicable to any 
isolated, collapsed halo. As motivation for future surveys, we also calculate where in
the mass-redshift plane the rarest halo is most likely to be found, giving information
as to which objects might be the most fruitful in the search for new physics. 

Key words: methods: analytical - methods: statistical - dark matter - large-scale 
structure of Universe - galaxies: clusters 



1 INTRODUCTION 

The concordance ΛCDM cosmology makes (via the halo mass function) definite
predictions for the abundance of dark matter haloes and for how that abundance
evolves with redshift. These haloes are visible to us as galaxy clusters, both
via their baryonic matter in the form of galaxies and hot, diffuse gas and
directly via weak lensing measurements. The form of the halo mass function is
sensitive to many of the plausible modifications to ΛCDM such as primordial
non-Gaussianity (Matarrese et al. 2000), dark energy models (Weller et al. 2002;
Baldi & Pettorino 2011) and modifications to gravity (Schmidt et al. 2009;
Ferraro et al. 2011; Lombriser et al. 2012), and cluster abundance has also been
shown to be competitive in estimating the parameters within ΛCDM (e.g. Mantz
et al. 2010), principally $\Omega_m$ and $\sigma_8$.

Ongoing surveys are revealing what are expected to be 
the most massive galaxy clusters at ever increasing redshifts, 
and these objects represent a tantalising possibility: because 
the high-mass tail of the halo mass function descends very 
steeply, the observation of even a single galaxy cluster which 
is massive enough at a given redshift has the potential to 
contradict the predictions of ΛCDM with high significance.
Following early work by Jimenez & Verde (2009), Holz & 
Perlmutter (2010) and Colombi et al. (2011), a number of 
statistical tests have been used to determine whether any 
of the clusters observed by recent experiments are in tension with the standard
model of cosmology. These tests have fallen broadly into two groups: those which
consider explicitly how likely a given cluster is to appear in a ΛCDM universe
(which we will refer to as 'rareness' methods), and those which make use of
Extreme Value Statistics (EVS). EVS methods predict the probability distribution
function for the mass of the most-massive cluster expected within a given survey
window and have generally found no tension with ΛCDM (Waizmann et al. 2012;
Harrison & Coles 2012; Chongchitnan & Silk 2012) when applied correctly to an
a priori defined survey window. In contrast, the rareness methods have chiefly
considered the probability that a cluster could exist with both greater mass and
redshift than those observed. Analyses using these methods have yielded
divergent results: Jee et al. (2009a, 2011), Jimenez & Verde (2009), Hoyle et al.
(2011a) and Enqvist et al. (2011) all claim tension with ΛCDM at the 95% level
and above, whilst Williamson et al. (2011), Menanteau et al. (2011) and Stanford
et al. (2012) do not.

It was observed in Hotchkiss (2011) that these rareness analyses suffer from the
(now continued) use of a biased statistical method, first introduced by Jimenez
& Verde (2009) and Holz & Perlmutter (2010), which causes them to significantly
overestimate the amount of tension a particular observation may cause.
Compounding this issue, the exclusion curves of Mortonson et al. (2011) (which
contain the same bias) have regularly been applied in a way which treats
uncertainty in cosmological parameters very conservatively. This conservatism
acts in the opposite direction to the statistical bias which causes
overestimation of tension, meaning papers using the Mortonson et al. (2011)
curves have tended not to contradict the predictions of ΛCDM even though they
still suffer from the use of the biased > m > z statistic. This is discussed in
section 4.2.

Nonetheless, the idea that observations of individual 
clusters are capable of falsifying a given cosmological model 
remains a valid and intriguing possibility even if no tension 
currently exists (see e.g. Hoyle et al. 2011b). Here we provide 
a consistent method to calculate the rareness of observed 
galaxy clusters: first quantifying how extreme each cluster is 
relative to others, before using the unbiased methods advoc- 
ated in Hotchkiss (2011) to determine the degree of tension 
they may cause with the predictions of ΛCDM. Our work
here goes beyond that of Hotchkiss (2011) first by the inclu- 
sion of the effects of (cosmological) parameter uncertainty, 
secondly by the inclusion of the effects of (mass) measure- 
ment uncertainty and thirdly by examining a much more 
comprehensive list of clusters (including some discovered 
after Hotchkiss (2011) was published). 

The paper is organised as follows. In section 2 we seek 
to define the problem of rare galaxy clusters, before reviewing previous methods
of estimating cluster rareness and reaffirming why they are biased. In section 3
we describe a method of ranking clusters by their relative extremeness, which we
quantify by their "equivalent mass at redshift zero" (also defined in
section 3). Section 4 details the construction of
unbiased measures of absolute rareness, which can be used 
to test the cosmological model and determine where in mass 
and redshift space it may be the most useful to probe. We 
apply these rareness and extremeness measures to a large set 
of observed clusters in section 5, where we rank the clusters 
according to their extremeness and set upper and lower limits
on the tension they represent with ΛCDM predictions.
Finally, section 6 contains a discussion of the results and 
concludes. 



We also make our methods available as a numerical code
online for application to new clusters as they are observed.



2 PREVIOUS MEASURES OF CLUSTER 
RARENESS 

2.1 Rareness 

We will be using the terms 'extremeness' and 'rareness' ex- 
tensively in this text. We use 'extremeness' to refer to the 
relative probability of observing an object (i.e. a very high 
mass cluster is more extreme than a lower mass one at the 
same redshift). We use 'rareness' to refer to the absolute 
probability that any cluster at least as extreme as the one 
in question could have been observed. The distinction is 
important: because of how the expected number of galaxy 
clusters evolves with redshift, lower mass objects can be less 
common (more extreme) if they lie at a higher redshift, and 
lower redshift ones less common (more extreme) if they lie 
at a higher mass. Whilst the extremeness on its own does 
allow us to rank events according to how probable they are, 
we may also frequently be interested in whether the obser- 
vation of any particular event should give us cause to doubt 
our theoretical model for predicting how often particular 
types of events occur. 

In any given statistical model, what we would call the 
'rareness' of an event is then uniquely defined by the probab- 
ility that an event with equal or greater extremeness could 
have occurred. If we make an observation of an event with 
a low occurrence probability, we cannot immediately argue 
that the model which we used to calculate the occurrence 
probability may be wrong. We must be careful to remem- 
ber the difference between the occurrence probability of one 
isolated event and the probability that any equivalent event 
could occur (i.e. the true 'rareness' as defined here). For ex- 
ample, just because the occurrence probability of each par- 
ticular set of lottery numbers is low, we do not have good 
evidence to discount our hypothesis of a 'fair lottery' when, 
inevitably, one particular set of numbers is drawn! 

In order to correctly consider whether an observation contradicts our
expectations for occurrence probabilities, we need to find the space of all
events less probable than the observed one and find the probability of having
observed any event within that space. This becomes the rareness, which we will
denote by $R$. If $\hat{R}$ turns out to be small (throughout, hatted notation
refers to observed values), then this does give us good evidence that our model
is wrong; if we find an $\hat{R} = 0.01$, only 1% of realisations of that model
will have contained any event as extreme as this one. We should expect that,
over many realisations, the observed rareness $\hat{R}$ of the most extreme
event has a linear cumulative distribution (i.e. that an event with rareness
$\hat{R} \leq R^*$ is observed twice as often as one with rareness
$\hat{R} \leq R^*/2$):

$P(\hat{R} \leq R^*) = R^*$. (1)

This is a very simple criterion. However (as we will discuss 
in section 2.3) many recent analyses of the ability of rare 
galaxy clusters to rule out cosmological models have fallen 
foul of using biased statistics for R which do not satisfy
equation (1). 
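
Equation (1) is straightforward to check numerically for any statistic that is constructed as a true rareness. The following Python sketch is a toy illustration only, not the numerical code released with this paper. It assumes that each realisation contains a fixed number of events whose extremeness is drawn from a unit exponential distribution (an arbitrary choice), and the function names `rareness_of_max`, `simulate_rareness` and `empirical_cdf` are hypothetical. The rareness of the most extreme event in each realisation is taken to be the probability that any of the events could have been at least that extreme.

```python
import math
import random


def rareness_of_max(x_max, n_events):
    """Probability that at least one of n_events unit-exponential draws is >= x_max."""
    return 1.0 - (1.0 - math.exp(-x_max)) ** n_events


def simulate_rareness(n_realisations=5000, n_events=50, seed=1):
    """Rareness of the most extreme event in each simulated realisation."""
    rng = random.Random(seed)
    values = []
    for _ in range(n_realisations):
        # Toy assumption: extremeness of every event is exponentially distributed.
        x_max = max(rng.expovariate(1.0) for _ in range(n_events))
        values.append(rareness_of_max(x_max, n_events))
    return values


def empirical_cdf(values, threshold):
    """Fraction of realisations whose observed rareness is <= threshold."""
    return sum(v <= threshold for v in values) / len(values)


if __name__ == "__main__":
    observed = simulate_rareness()
    for r_star in (0.05, 0.1, 0.5):
        print(r_star, round(empirical_cdf(observed, r_star), 3))
```

Under these assumptions the fraction of realisations with $\hat{R} \leq R^*$ should track $R^*$ to within sampling noise, which is the behaviour demanded by equation (1).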

2.2 Galaxy Cluster Abundance 

In the standard cosmological model with Gaussian ini- 
tial conditions and hierarchical structure growth, high-mass 
galaxy clusters are expected to evolve from high peaks in the 
initial cold dark matter (CDM) density fluctuations. The 
smallest scales collapse first, before merging over time to
form ever more massive CDM haloes, into which baryons fall
to form galaxy clusters. Consequently, high mass clusters are
expected to be very rare at early times, as reflected in the
exponential steepness of the halo mass function $n(m, z)$. The
steepness of this tail is also highly sensitive to the physical 
assumptions which go into the initial conditions and dynam- 
ical evolution of the dark matter over-density field, mean- 
ing the observation of even a single sufficiently extreme (in 
terms of both its mass and redshift) cluster has the potential 
to provide strong evidence against a particular cosmological 
model. 

The number of galaxy clusters expected to occur in a survey window covering a
fraction of the sky $f_{\rm sky}$ and sensitive to clusters with masses between
$m_{\min}$ and $m_{\max}$ at redshifts between $z_{\min}$ and $z_{\max}$ is
given by the integrated product of the halo mass function and volume element
within this region:

$\langle N \rangle = f_{\rm sky} \int_{z_{\min}}^{z_{\max}} \int_{m_{\min}}^{m_{\max}} dz\, dM\, \frac{dV}{dz} \frac{dn(M, z)}{dM}$. (2)



In real surveys the mass of a halo is not measured directly,
but via proxies such as X-ray gas temperature $T_X$, galaxy
velocity dispersion $\sigma_v$ or the thermal Sunyaev-Zel'dovich
(tSZ) Compton-$y$. The realities of detecting these proxies
mean that real surveys are not typically mass limited
(although tSZ surveys approach this) and the use of absolute
mass and redshift limits is a crude approximation to the 
real selection function. However, in this paper we will en- 
deavour to be conservative with our approximate selection 
functions, providing lower limits on cluster detection prob- 
abilities. The methodology presented here can still be ap- 
plied in the advantageous situation where the full selection 
function is known, and our conclusions are expected to be 
stable. 

Throughout this work, the cosmology assumed is that described by the
WMAP7+BAO+H0 ML parameters given by Komatsu et al. (2011). From these parameters
we calculate the linear matter power spectrum $P(k)$ using the numerical
Einstein-Boltzmann code CAMB and in turn the variance $\sigma^2(m, z)$, smoothed
with a top hat window function $W(k; m)$ and evolved to a redshift of $z$ with
the normalised linear growth function $D_+(z)$:

$\sigma^2(m, z) = D_+^2(z) \int_0^\infty \frac{dk}{2\pi}\, k^2 P(k)\, W^2(k; m)$. (3)
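The calculated $\sigma(m, z)$ is then used in the version of the
Tinker et al. (2008) mass function:

$\frac{dn(m, z)}{dm} = A \left[ \left( \frac{\sigma}{b} \right)^{-a} + 1 \right] e^{-c/\sigma^2}\, \frac{\rho_{m,0}}{m}\, \frac{d\ln(\sigma^{-1})}{dm}$, (4)

which includes evolving parameters: $A = 0.186(1 + z)^{-0.14}$,
$a = 1.47(1 + z)^{-0.06}$, $b = 2.57(1 + z)^{-0.011}$, $c = 1.19$.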
This mass function has been well tested against large, high- 
resolution N-body simulations and has become the most fre- 
quently used in cosmological analyses. 
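
As a rough guide to how steeply this mass function falls in the high-mass tail, the multiplicity function $f(\sigma) = A[(\sigma/b)^{-a} + 1]e^{-c/\sigma^2}$ appearing in equation (4) can be evaluated directly. The Python sketch below is illustrative rather than the code used for our results. The redshift exponents (-0.14, -0.06 and -0.011) are what we understand to be the overdensity 200 values of Tinker et al. (2008) and should be checked against that paper, and the function names `tinker_params` and `tinker_f` are hypothetical.

```python
import math


def tinker_params(z):
    """Redshift-evolving parameters of equation (4); exponents assumed from Tinker et al. (2008)."""
    A = 0.186 * (1.0 + z) ** -0.14
    a = 1.47 * (1.0 + z) ** -0.06
    b = 2.57 * (1.0 + z) ** -0.011
    c = 1.19
    return A, a, b, c


def tinker_f(sigma, z):
    """Multiplicity function f(sigma) = A [(sigma / b)^(-a) + 1] exp(-c / sigma^2)."""
    A, a, b, c = tinker_params(z)
    return A * ((sigma / b) ** -a + 1.0) * math.exp(-c / sigma ** 2)


if __name__ == "__main__":
    # Smaller sigma corresponds to rarer, more massive haloes.
    for sigma in (1.0, 0.7, 0.5, 0.4):
        print(sigma, tinker_f(sigma, 0.0))
```

The full $dn/dm$ additionally requires $\sigma(m, z)$ from equation (3) and its logarithmic derivative, which are not reproduced in this sketch.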



2.3 The Biased > m > z Statistic

Most of the analyses which have been performed previously have calculated the
rareness of a cluster (with observed mass $\hat{m}$ and redshift $\hat{z}$) via
the expected number of clusters which exist with both greater mass and redshift:

$\langle N_{>\hat{m}>\hat{z}} \rangle = f_{\rm sky} \int_{\hat{z}}^{\infty} \int_{\hat{m}}^{\infty} dz\, dM\, \frac{dV}{dz} \frac{dn(M, z)}{dM}$. (5)
The rareness $R_{>m>z}$ is then taken to be the Poisson probability that
$N_{>\hat{m}>\hat{z}} \geq 1$ in a given universe. However, as first pointed out
by Fergus Simpson and later expounded in Hotchkiss (2011), using $R_{>m>z}$ as a
rareness statistic is incorrect as it implies that the only clusters more
extreme than $(\hat{m}, \hat{z})$ are those with both a higher mass and
redshift. This ignores the fact that there will be clusters at lower redshift
and higher mass, or higher redshift and lower mass, which have occurrence
probabilities equal to or lower than the cluster observed. Far too few clusters
are thus counted as more extreme, and the calculated probability of having
observed something with a given rareness is correspondingly too low. As shown in
Hotchkiss (2011), Monte Carlo simulations of $R_{>m>z}$ show that it does not
obey equation (1) and that the probability of observing a cluster with
$R_{>m>z} = \hat{R}$ is significantly greater than $\hat{R}$. This means that
$(1 - R_{>m>z})$ should not be used as the confidence level to which the
observation rules out a given cosmology. In fact, the observation of a cluster
with a small value of $R_{>m>z}$ is actually expected to be quite a common event
in a ΛCDM universe; it turns out that more than half of all full sky
realisations of ΛCDM clusters will contain a cluster, below z = 2, which shows
tension at above the 95% level using the $R_{>m>z}$ method!
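
The origin of this bias can be reproduced with a deliberately simple toy model. The Python sketch below is an illustration under stated assumptions and is not the Monte Carlo of Hotchkiss (2011): each realisation contains a fixed number of clusters (rather than a Poisson-distributed number), masses follow a unit exponential distribution and redshifts are uniform on [0, 1], so that the expected number with greater mass and redshift is simply `n_total * exp(-m) * (1 - z)`. All function names are hypothetical. For each realisation we record the smallest value of the biased statistic over all clusters, mimicking an observer who singles out the most unusual-looking object after the fact.

```python
import math
import random


def expected_number_above(m, z, n_total):
    """Toy <N(>m, >z)>: exponential in mass, uniform in redshift on [0, 1]."""
    return n_total * math.exp(-m) * (1.0 - z)


def smallest_biased_rareness(rng, n_total):
    """Smallest R_{>m>z} = 1 - exp(-<N>) over the clusters of one realisation."""
    best = 1.0
    for _ in range(n_total):
        m = rng.expovariate(1.0)  # mass drawn from the toy mass function
        z = rng.random()          # redshift drawn uniformly
        r = 1.0 - math.exp(-expected_number_above(m, z, n_total))
        best = min(best, r)
    return best


def fraction_in_apparent_tension(level=0.05, n_total=100, n_realisations=2000, seed=2):
    """Fraction of realisations containing a cluster with R_{>m>z} <= level."""
    rng = random.Random(seed)
    hits = sum(smallest_biased_rareness(rng, n_total) <= level
               for _ in range(n_realisations))
    return hits / n_realisations


if __name__ == "__main__":
    print(fraction_in_apparent_tension())
```

In this toy model roughly a third of realisations are expected to contain a cluster with $R_{>m>z} \leq 0.05$, far more than the 5% that a statistic obeying equation (1) would give. The numbers for a realistic mass function and volume element will differ, but the direction of the bias is the same.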

This measure was first used explicitly in Holz & Perlmutter (2010), and it and
highly similar methods (all of which, based on the observed cluster, make an
a posteriori choice of mass and redshift to integrate from) have led to
erroneous over-estimations of the tension of multiple clusters with the ΛCDM
model (Jimenez & Verde 2009; Hoyle et al. 2011a; Cayon et al. 2011; Foley et al.
2011; Jee et al. 2009a, 2011; Carlesi et al. 2011; Enqvist et al. 2011).

In addition to these issues, Mortonson et al. (2011) use
$\langle N_{>\hat{m}>\hat{z}} \rangle$ to construct 'exclusion curves' in the
mass-redshift plane. In this method an $\alpha$% exclusion curve is defined as
the set of points for which $N_{>m>z} = 0$ in $\alpha$% of realisations of
$\alpha$% of the allowed parameter space. It is important to point out that
because the curves (as used in Williamson et al. 2011; Stanford et al. 2012;
Menanteau et al. 2011; Miyatake et al. 2012) are constructed using
$\langle N_{>\hat{m}>\hat{z}} \rangle$ they suffer from the same bias discussed
above, which causes them to overestimate the amount of tension a given cluster
is in with a cosmological model. The reason that the Mortonson et al. (2011)
curves have not led to claims of the same tension with ΛCDM is their
exceptionally conservative treatment of parameter uncertainty. This conservatism
works in the opposite direction to the bias due to the use of $N_{>m>z}$,
bringing observed clusters back within the statistically acceptable region, and
is discussed in section 4.2.

2.4 Extreme Value Statistics 

An alternative approach to rare galaxy clusters is through Extreme Value
Statistics. In EVS, predictions are made for the extrema of samples (rather than
their mean as in conventional analyses) allowing a probability distribution
function for the mass of the most-massive galaxy cluster within a survey window
to be drawn. Colombi et al. (2011) were the first to apply EVS methods to galaxy
clusters, showing they made accurate predictions for N-body simulations. A
series of papers have since considered whether any observed cluster is
significantly more massive than the expected most-massive, with the main
differences being in choices of survey windows (which must be chosen a priori in
order to avoid a similar bias to the one in the $R_{>m>z}$ statistic). Waizmann
et al. (2012) and Chongchitnan & Silk (2012) consider survey windows with a
large extent in redshift but with $f_{\rm sky}$ corresponding to relevant
experiments, whilst Harrison & Coles (2012) consider very narrow redshift bins
but conservatively set $f_{\rm sky} = 1$. All of these analyses find that no
cluster is in tension with the predictions of ΛCDM.
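
The central object in such an analysis, the distribution of the most-massive cluster in a fixed window, has a simple form when cluster counts are Poisson distributed: the probability that the most massive cluster lies below mass m is the probability that no cluster exists above m. The Python sketch below illustrates this under toy assumptions and does not reproduce the method of any of the papers cited above. The exponential form of `n_above`, its parameters `n0` and `m_star`, and all function names are hypothetical.

```python
import math


def n_above(m, n0=1000.0, m_star=1.0):
    """Toy expected number of clusters above mass m in the survey window."""
    return n0 * math.exp(-m / m_star)


def cdf_max_mass(m):
    """P(most massive cluster <= m) = P(no cluster above m) for Poisson counts."""
    return math.exp(-n_above(m))


def quantile_max_mass(p, lo=0.0, hi=100.0, tol=1e-10):
    """Invert the CDF by bisection (the CDF increases monotonically with m)."""
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if cdf_max_mass(mid) < p:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)


if __name__ == "__main__":
    # Median and upper 95th percentile of the most-massive cluster (toy units).
    print(quantile_max_mass(0.5), quantile_max_mass(0.95))
```

An observed most-massive cluster would then be compared with these quantiles, with the survey window fixed before looking at the data.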



3 RANKING CLUSTERS BY EQUIVALENT 
MASS AT REDSHIFT ZERO 

How unusual a galaxy cluster is will be determined by two parameters: its mass
and redshift. Whilst ranking objects according to a single parameter (and
conducting EVS on this parameter) is trivial, the ranking of multivariate data
is a problem with no unique solution (see Barnett 1976, for a discussion). If we
wish to rank clusters according to their extremality then we are therefore
required to choose a mapping of $m$ and $z$ onto a single parameter by which
they may then be ranked. Hotchkiss (2011) suggests using the 'equivalent mass at
redshift zero', $m_0$. This quantity is found by calculating a particular
property of an observed cluster and then finding the mass of the cluster at
redshift $z = 0$ which has the same value of that property. This way of dealing
with extremeness has the advantage that $m_0$ is an intrinsic property of each
cluster and is not dependent on the survey it was selected from. This both
allows comparison of objects from different surveys and provides an intuitive
understanding of which is more unusual: a high-mass cluster at low redshift, or
a lower-mass one at high redshift. The property chosen to map between $(m, z)$
and $m_0$ needs to be physically well-motivated and can be chosen in such a way
as to be sensitive to specific modifications of the ΛCDM cosmology. Here, we
define three measures which may be used to define the extremeness, $m_0$.

3.1 Number with greater mass and redshift > m > z

Even though it has been used to define the biased $R_{>m>z}$ statistic, the
expected number of clusters at a greater mass and redshift than an observed
cluster, $\langle N_{>\hat{m}>\hat{z}} \rangle$, may still be used in an
unbiased way. The extremeness, $m_0$, may be defined as the mass at redshift
zero which has the same expected number of clusters in the space $> m_0$,
$> z = 0$ as the observed cluster has in $> \hat{m}$, $> \hat{z}$. Whilst this
definition may appear intuitive, the number of clusters above a given mass and
redshift may be seen as unduly sensitive to the background expansion: clusters
at points in redshift space with the largest volume element will be
down-weighted in the rareness ranking, even if they have a small number of
expected more massive clusters per unit of volume.



3.2 Peak height > ν

A more physically motivated choice of parameter is the peak height:

$\nu(m, z) \propto \frac{1}{D_+(z)\, \sigma(m)}$, (6)

corresponding to the depth into the tail of its distribution of the initial
density fluctuation which may be associated with the cluster. This choice is
appropriate if we are interested in potential modifications to the CDM initial
conditions: with primordial non-Gaussianity, for example, more rare peaks are
expected and clusters with a higher $m_0$ become more common.
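
For the peak height measure, the mapping to an equivalent mass at redshift zero amounts to solving $\nu(m_0, 0) = \nu(\hat{m}, \hat{z})$ for $m_0$. The Python sketch below shows one way to do this by bisection. It is a toy illustration only: the power-law `sigma_of_mass` and the Einstein-de Sitter `growth` are stand-ins chosen for brevity (a real calculation would use equation (3) and the ΛCDM growth function), and every function name and default value here is hypothetical.

```python
import math


def sigma_of_mass(m, sigma_star=1.0, m_star=1.0e13, gamma=0.3):
    """Toy power-law rms fluctuation sigma(m); NOT a real power-spectrum integral."""
    return sigma_star * (m / m_star) ** -gamma


def growth(z):
    """Toy Einstein-de Sitter growth factor, normalised to D(0) = 1."""
    return 1.0 / (1.0 + z)


def peak_height(m, z):
    """nu proportional to 1 / (D(z) sigma(m)); the constant of proportionality cancels."""
    return 1.0 / (growth(z) * sigma_of_mass(m))


def equivalent_mass_z0(m_obs, z_obs, lo=1.0e10, hi=1.0e20, n_iter=200):
    """Mass at z = 0 with the same peak height as (m_obs, z_obs), by bisection in log m."""
    target = peak_height(m_obs, z_obs)
    log_lo, log_hi = math.log(lo), math.log(hi)
    for _ in range(n_iter):
        log_mid = 0.5 * (log_lo + log_hi)
        # Peak height increases with mass, so move the lower bound up if too small.
        if peak_height(math.exp(log_mid), 0.0) < target:
            log_lo = log_mid
        else:
            log_hi = log_mid
    return math.exp(0.5 * (log_lo + log_hi))


if __name__ == "__main__":
    print(equivalent_mass_z0(1.0e15, 1.0))
```

With these toy choices the answer is known in closed form, $m_0 = \hat{m}(1 + \hat{z})^{1/\gamma}$, which provides a check on the root finder. The same bisection applies unchanged to the other two measures once `peak_height` is replaced by the relevant property.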



3.3 Mass per Unit Volume > mdV

We may also choose to define $m_0$ via the expected number
of more massive clusters per unit volume at a given redshift:

$\langle N_{>\hat{m}dV} \rangle = f_{\rm sky} \int_{\hat{m}}^{\infty} dM\, \frac{dn(M, z)}{dM}$. (7)



Using this definition has the advantage that it fairly weights 
all clusters at high masses, even those which come from 
low-volume regions in the redshift dimension. 

Figure 1 shows contour lines for each of the three parameters used to define
$m_0$, where each line corresponds to points with constant values of the
relevant parameter. As can be seen, the different definitions do not map points
onto $m_0$ in the same way. For instance, the $\nu$ definition will assign the
highest $m_0$ to the highest point in the initial density field, regardless of
how the volume expansion or growth law proceeds between that epoch and $z = 0$,
meaning they appear as steeper contours on the mass-redshift plane. In contrast,
the tendency of the $\langle N_{>m>z} \rangle$ measure to downweight very low
redshift clusters because of the larger volume element at $z \sim 0.3$ can be
seen in the flattening of the contours at these low redshifts.



4 UNBIASED MEASURES OF ABSOLUTE 
CLUSTER RARENESS 

Figure 1. Contour plots for the three $m_0$ statistics, showing
how points in the m-z plane are mapped to equivalent masses at
z = 0.

Once the equivalent mass at redshift zero has been used to rank clusters and
hence identify which are the most extreme observed, a separate process needs to
be undertaken in order to calculate the level of tension the rarest clusters
signify with the cosmological model. The conflation of these two steps was a key
issue in earlier calculations: whilst the probability of finding
$N_{>\hat{m}>\hat{z}} \geq 1$ is a real probability, may be small for an
individual cluster, and will be smaller the more extreme a cluster is, it should
not be used to calculate the level of tension with ΛCDM. This is because of the
inherent a posteriori bias that arises from the implicit assertion that no
cluster at a lower mass or lower redshift would have been assigned the same or
lower rareness by the observer.
An unbiased measure of the tension an observed cluster implies with a
cosmological model can be found by performing the following steps:

(i) Define and calculate a specific measure of a cluster's
extremeness (e.g. $m_0$).

(ii) Calculate the extremeness at all points on the mass-redshift plane.

(iii) Find the contour of clusters which have an equal extremeness
to that of the observed cluster.

(iv) Calculate the probability that the survey in which
the cluster was found could observe any cluster above this
contour.

In this analysis for step (iv) we assume a simple survey selec- 
tion function of a constant limiting mass and upper redshift 
cutoff and assume the probability that a cluster could exist 
above the contour to be Poisson distributed (a good approx- 
imation for very high-mass galaxy clusters). A more soph- 
isticated analysis would be possible on a per-survey basis, 
taking into account the full selection functions (which we 
sadly do not have access to), such as the one which has 
been carried out by Stalder et al. (2012), who perform all 
four of these steps using simulations of their observational 
survey to compare with the observed cluster. 
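
The four steps, with the simple selection function and Poisson assumption just described, can be sketched in a few lines of Python. This is an illustration under toy assumptions and not the numerical code released with this paper: the linear `extremeness` measure, the exponential abundance `n0 * exp(-m)` per unit mass and redshift, and all function names are hypothetical, with mass in arbitrary units.

```python
import math


def extremeness(m, z, k=2.0):
    """Steps (i)-(ii): toy extremeness measure, increasing in both mass and redshift."""
    return m + k * z


def expected_above_contour(e_obs, m_min, z_max, n0=1000.0, k=2.0, n_steps=20000):
    """Steps (iii)-(iv): <N> for clusters at least as extreme as e_obs in the window.

    Toy abundance: dN/dm/dz = n0 * exp(-m), so the mass integral above a
    threshold m_t is n0 * exp(-m_t); the z integral uses the midpoint rule.
    """
    dz = z_max / n_steps
    total = 0.0
    for i in range(n_steps):
        z = (i + 0.5) * dz
        m_contour = e_obs - k * z            # mass on the equal-extremeness contour
        m_threshold = max(m_min, m_contour)  # the survey cannot see below m_min
        total += n0 * math.exp(-m_threshold) * dz
    return total


def rareness(m_obs, z_obs, m_min, z_max):
    """Poisson probability that the survey sees at least one cluster above the contour."""
    e_obs = extremeness(m_obs, z_obs)
    return 1.0 - math.exp(-expected_above_contour(e_obs, m_min, z_max))


if __name__ == "__main__":
    # The same cluster is rarer in a shallower survey than in a deeper one.
    print(rareness(12.0, 0.5, m_min=5.0, z_max=1.5))
    print(rareness(12.0, 0.5, m_min=5.0, z_max=3.0))
```

The key design point is that the integration region is bounded by the contour of equal extremeness and the survey limits, not by the observed mass and redshift themselves. Replacing the toy abundance by equations (2) and (4), and the toy measure by any of the three definitions of section 3, leaves the structure unchanged.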

Figure 2 shows contours of the calculated rareness probabilities for clusters in
an idealised experiment with $f_{\rm sky} = 1$,
$m_{\min} = 1 \times 10^{[\ldots]}\,M_\odot/h$ and $z_{\max} = 6$. These
probabilities represent the true 'rareness', i.e. if a cluster is observed on
the 0.99 contour of these plots, then it may be reasonably cited as tension at
the 99% level with ΛCDM. It is important to point out that these contours depend
explicitly on the selection function assumed for the survey: the lower mass
limit, upper redshift limit and fraction of the sky observed. The ones shown
here refer to a survey which covers the entire mass-redshift region shown in the
plots, but for surveys which are expected to be complete to a lower redshift and
above a higher limiting mass these curves would be lower, corresponding to the
naive expectation that we are more surprised to find something globally unusual
in a small survey than a large one.

4.1 Location of the rarest clusters 

An interesting question which may be asked of rare clusters relates to where in
the mass-redshift plane we may expect the rarest observed cluster to be found.
Answering this question can give information about where cluster surveys can be
most productively targeted, or indeed what kind of objects may be most sensitive
probes of the tail of the halo mass function. The plots in figure 3 show the
convolution of cluster rareness and the halo mass function, showing the
probability distribution for the location of the rarest observed cluster
according to each of the three measures described above. The rarest cluster
according to the $> \nu$ measure is always most likely to appear at the highest
specified redshift ($z = 6$ for these plots), whilst the rarest clusters
according to the $> m > z$ and $> mdV$ measures are most likely to be observed
at $z \sim 1$ and $z \sim 2.5$ respectively.

An interesting inference can be made from the $> \nu$ plot with regards to
attempts to constrain primordial non-Gaussianity with rare objects. The
modification to the halo mass function caused by primordial non-Gaussianity
depends almost entirely on $\nu$. The tendency of surveys to be most likely to
find their rarest objects, according to the $\nu$ definition, at the highest
possible redshift (and lower absolute masses) indicates that it is perhaps not
galaxy clusters but higher redshift events such as lensing arcs and quasars
which may prove the most sensitive probes of non-Gaussianity.

Figure 2. Contours of equal rareness, as defined by the three different
properties in section 4 and corresponding to an idealised all-sky survey which
is complete at masses above $m_{\min}$ out to z = 6. The numbers on the figures
are 1 - R for that curve, representing the confidence level to which ΛCDM could
be ruled out by observation of a cluster above that contour.



Figure 3. Heat map of the three statistics, showing where rarest
clusters are most likely to be observed.

4.2 Dealing with parameter uncertainty

If we are seeking to test the ΛCDM model, we need to take into account the
uncertainties on the values of the parameters within the model correctly and
with clarity. Previous results using the Mortonson et al. (2011) exclusion
curves have dealt with both parameter uncertainty $p$ and sample uncertainty $s$
together. The exclusion curves are often drawn with the fiducial choice of
$p = s$ and referred to as "$(1 - p)$% exclusion curves". This means that the
points on the curve will have $N_{>m>z} = 0$ in $(1 - p)$% of realisations of
$(1 - p)$% of the allowed parameter space. For a 95% exclusion curve, the
probability of observing something in the $N_{>\hat{m}>\hat{z}}$ region of each
point is therefore in fact $(5\%)^2 = 0.25\%$. This is statistically valid, but
extremely conservative. It is this extremely conservative treatment of parameter
uncertainty which has caused contradictory results between the Mortonson et al.
(2011) exclusion curves and the $R_{>m>z}$ methods, allowing clusters to not
appear in tension with ΛCDM even though the biased rareness method was used in
the creation of the exclusion curves. These effects work against each other;
however, they do not precisely cancel. There are potential situations where the
Mortonson et al. (2011) curves will fail to find tension when it does in fact
exist, or erroneously claim tension for clusters that are in fact unremarkable.

Such an extreme degree of conservatism is unnecessary. As long as we don't
introduce biases or make poor assumptions we want to be as sensitive to new
physics as possible. A less conservative, but still statistically robust, way to
treat parameter uncertainty is to simply marginalise the unbiased rareness $R$
over available prior constraints on the cosmological parameters:

$R = \int d\Theta\, R(\Theta)\, \Pi(\Theta)$, (8)

where $\Theta$ is the full set of cosmological parameters and $\Pi(\Theta)$ is
the available prior probability distribution for those parameters. Of the
standard model's cosmological parameters, it is the normalisation of the linear
matter power spectrum $\sigma_8$ which has by far the most significant influence
on cluster abundance. For the analysis below we use a Gaussian prior on
$\sigma_8$ from Komatsu et al. (2011), with a mean of 0.811 and
standard deviation of 0.03.
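
For a single parameter, the marginalisation in equation (8) reduces to a one-dimensional quadrature over the prior. The Python sketch below illustrates this for the Gaussian prior on $\sigma_8$ quoted above. It is not the code used for our results: the function `toy_rareness` is a hypothetical stand-in for the full calculation of rareness as a function of $\sigma_8$, and the other names, the integration range of five standard deviations and the trapezium rule are choices made only for this example.

```python
import math


def gaussian_pdf(x, mean, sd):
    """Normal probability density."""
    return math.exp(-0.5 * ((x - mean) / sd) ** 2) / (sd * math.sqrt(2.0 * math.pi))


def marginalised_rareness(rareness_fn, mean=0.811, sd=0.03, n_sd=5.0, n_steps=2001):
    """Prior-weighted average of rareness_fn(sigma8) by the trapezium rule."""
    lo, hi = mean - n_sd * sd, mean + n_sd * sd
    h = (hi - lo) / (n_steps - 1)
    num = 0.0
    norm = 0.0
    for i in range(n_steps):
        s8 = lo + i * h
        w = 0.5 if i in (0, n_steps - 1) else 1.0  # trapezium end-point weights
        p = gaussian_pdf(s8, mean, sd)
        num += w * rareness_fn(s8) * p * h
        norm += w * p * h
    return num / norm  # renormalise for the truncated prior tails


def toy_rareness(sigma8):
    """Hypothetical R(sigma8): rareness rises steeply with the power-spectrum amplitude."""
    return min(1.0, 0.02 * math.exp(20.0 * (sigma8 - 0.811)))


if __name__ == "__main__":
    print(toy_rareness(0.811), marginalised_rareness(toy_rareness))
```

Because rareness typically depends on $\sigma_8$ in a strongly non-linear way, the marginalised value generally differs from the value at the prior mean. In this toy case it is somewhat larger, so the cluster appears less unusual once parameter uncertainty is included.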



5.1 Dealing with measurement uncertainty 

A final consideration to be made when examining high-mass galaxy clusters is the
expected posterior distribution for the cluster mass $P(m|X_{\rm obs})$. Here,
$X_{\rm obs}$ is to be understood as the full set of observable parameters
relating to the measurement of a cluster's mass. Because the prior distribution
on cluster mass (the halo mass function) varies significantly over the width of
the observed likelihood for the cluster mass, its effect must be taken into
account (Andreon 2009). This effect constitutes the classical Eddington bias for
number counts: because there are significantly more clusters in lower mass bins
which may upscatter into higher bins than there are high mass clusters to
scatter downwards, we must adjust our number counts accordingly. Where necessary
(where such a correction has not been taken into account in the published mass
estimate), we therefore use distributions for cluster masses given by:

$P_\Lambda(m|X_{\rm obs})\,dm = \frac{1}{A} \frac{dn}{dm}\, P_{\rm UP}(m|X_{\rm obs})\,dm$, (9)

where $P_{\rm UP}(m|X_{\rm obs})$ is the posterior probability distribution
associated with the quoted observational uncertainty on cluster mass (either
normal or log-normal) which has been calculated using a uniform prior (UP) on
cluster mass and $A$ is a normalisation constant. The given rareness values are
then calculated by marginalising over the rareness values for the support of
this distribution. This method will give the correct posterior mass uncertainty
for a ΛCDM prior if and only if the original quoted observable uncertainties are
the statistically correct posterior mass uncertainties obtained assuming a
uniform prior on cluster mass (i.e. essentially no prior).
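
The size of this correction can be seen in a toy example. The Python sketch below multiplies a Gaussian uniform-prior posterior in ln(m) by a power-law stand-in for the mass function and renormalises on a grid, in the spirit of equation (9). It is an illustration only: working per unit ln(m), the power-law slope `beta` and the function names are assumptions made for this example, and a real application would use the mass function of equation (4).

```python
import math


def corrected_posterior(mu, s, beta, n_grid=4001, n_sd=8.0):
    """Grid posterior in x = ln(m): toy mass-function prior times Gaussian likelihood.

    mu, s : mean and width of the uniform-prior posterior in ln(m)
    beta  : toy logarithmic slope, dn/dln(m) proportional to m^(-beta)
    """
    lo, hi = mu - n_sd * s, mu + n_sd * s
    h = (hi - lo) / (n_grid - 1)
    xs = [lo + i * h for i in range(n_grid)]
    # The prior is written relative to mu to avoid overflow; the constant cancels.
    weights = [math.exp(-beta * (x - mu)) * math.exp(-0.5 * ((x - mu) / s) ** 2)
               for x in xs]
    norm = sum(weights) * h  # plays the role of the normalisation constant A
    return xs, [w / norm for w in weights]


def posterior_mean(xs, pdf):
    """Mean of ln(m) under the gridded posterior."""
    h = xs[1] - xs[0]
    return sum(x * p for x, p in zip(xs, pdf)) * h


if __name__ == "__main__":
    mu = math.log(1.0e15)
    xs, pdf = corrected_posterior(mu, s=0.2, beta=3.0)
    print(mu, posterior_mean(xs, pdf))
```

For this toy prior the corrected posterior remains Gaussian in ln(m) with its mean shifted downwards by `beta * s**2`, so the correction grows with both the steepness of the mass function and the size of the measurement uncertainty.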



5 RARENESS OF CURRENTLY OBSERVED 
CLUSTERS 

In this section we consider a large number of currently observed clusters,
investigate whether any are sufficiently rare as to give concern for the
standard model using our unbiased rareness statistics described above, and seek
to rank them according to their extremeness. Estimation of cluster masses is a
procedure fraught with uncertainty. It has been found both observationally (see
Rozo et al. 2012, and references therein) and in N-body simulations (Angulo
et al. 2012) that masses (and ordering of most-massive clusters) estimated using
different proxies are frequently inconsistent with each other. Further
uncertainty occurs when converting between mass definitions for comparison with
halo mass functions: both a halo profile (frequently NFW) and a
mass-concentration relation must be assumed, both of which must be calibrated
using N-body simulations. Such considerations are outside the scope of this
paper, however. Here we choose to search the literature for published
estimations of cluster masses and take them 'at face value'. This choice naively
ignores differences between survey mass proxies and sensitivities, which may in
reality widen published error estimates, and all of our conclusions are
predicated on this naivety. However, where robust estimates on cluster mass and
uncertainty are available our method remains robust. A full description of the
cluster mass estimates used can be found in appendix A.



5.2 Rarest and most-massive clusters 

Tables 1-3 show the clusters with the ten highest values of 
$m^{0}_{\rm eq}$ calculated using each of the three methods discussed 
in section 4. For each of these clusters, we also find the 
probability that a cluster as rare as this one could have been 
observed by the survey it was measured in. Because we do 
not have access to the full selection functions for the relevant 
surveys, we are not able to make reasonable estimates of the 
true rarenesses of clusters within those surveys. However, we 
can set both upper and lower limits on cluster rarenesses by 
choosing suitable approximate selection functions. We set 
lower limits on rareness (which correspond to upper limits on 
the degree of tension with ΛCDM) by choosing the minimal 
survey window in mass-redshift space in which the cluster 
may have been found. We do this by considering only the 
complete (i.e. where the probability of detection $\approx 1$) region 
of the survey. Hence we choose high values of $m_{\rm min}$ and low 
values of $z_{\rm max}$ for each survey and only consider $f_{\rm sky}$ for that 
particular survey. These lower limit rareness values are the 
ones shown in tables 1-3. In this conservative treatment, the 
lowest rareness value is found to be $R_{\nu} = 0.35$ for the cluster 
XMMUJ2235-2557, meaning clusters at least as extreme as 
this are found in a substantial fraction (35%) of realisations of ΛCDM 
cosmologies and the corresponding tension with ΛCDM is 
only at the 0.65 level, around one sigma. 
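
The step from a rareness of 0.35 to "around one sigma" can be checked with a short sketch. It assumes, as a convention not stated explicitly in the text, that the confidence level $1-R$ is translated into a two-sided Gaussian-equivalent significance; the function name `tension_in_sigma` is hypothetical.

```python
from statistics import NormalDist

def tension_in_sigma(rareness):
    """Two-sided Gaussian-equivalent significance of the confidence level 1 - R."""
    if not 0.0 < rareness <= 1.0:
        raise ValueError("rareness must lie in (0, 1]")
    confidence = 1.0 - rareness
    # A two-sided interval containing `confidence` ends at this quantile.
    return NormalDist().inv_cdf(0.5 * (1.0 + confidence))

# Lower-limit rareness quoted above for XMMUJ2235-2557.
sigma_xmmu = tension_in_sigma(0.35)
```

Under this convention a rareness of 0.35 corresponds to a little under one sigma, consistent with the statement above.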

Conversely, we can set upper limits on cluster rareness 
(which correspond to lower limits on the amount of tension 
with ΛCDM) by allowing the clusters to have been found 
anywhere within the mass-redshift plane and setting $f_{\rm sky} = 1$. 
These limits are represented by figure 4, which shows how 
far the clusters with the ten highest $m^{0}_{\rm eq}$ values are from 
the relevant 'exclusion contours' of $1-R$ for an all-sky survey 
which is complete above masses of $10^{14}\,h^{-1}M_{\odot}$ and out to a 
redshift of $z = 2$. 

Figure 4. Rareness of currently observed clusters (using the 
> m dV measure described in the text) corresponding to an idealised
all-sky survey which is complete at masses above $m_{\rm min} = 10^{14}\,M_{\odot}/h$ 
out to $z = 2$. 

It is worth noting that in previous analyses, a high de- 
gree of tension was being found and survey regions to be 
tested were chosen 'conservatively', so as to set lower bounds 
on tension (i.e. upper bounds on rareness). Here we are 'con- 
servative' in both directions: underestimating survey regions 
to set lower (upper) bounds on rareness (tension) and over- 
estimating them to set upper (lower) bounds on rareness 
(tension). 
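
The two-sided bounding argument can be sketched numerically. The example below assumes that the number of clusters at least as extreme as the observed one is Poisson distributed, so that the rareness is the probability of finding at least one; this Poisson form, the function names and all expected counts are hypothetical illustrations rather than values or definitions taken from this paper.

```python
import math

def poisson_rareness(expected_count):
    """Probability of at least one cluster for a Poisson-distributed count."""
    return 1.0 - math.exp(-expected_count)

def bound_rareness(n_min_window_all_sky, f_sky_survey, n_full_plane_all_sky):
    """Return (lower, upper) limits on rareness for one cluster.

    n_min_window_all_sky : expected all-sky count of equally extreme clusters
                           inside the minimal (complete) mass-redshift window
    f_sky_survey         : sky fraction of the survey that found the cluster
    n_full_plane_all_sky : expected all-sky count over the whole plane
    """
    if n_min_window_all_sky > n_full_plane_all_sky:
        raise ValueError("the minimal window must be a subset of the full plane")
    r_lower = poisson_rareness(f_sky_survey * n_min_window_all_sky)
    r_upper = poisson_rareness(n_full_plane_all_sky)  # f_sky = 1
    return r_lower, r_upper

# Made-up expected counts, for illustration only.
r_lo, r_hi = bound_rareness(n_min_window_all_sky=3.0,
                            f_sky_survey=0.06,
                            n_full_plane_all_sky=6.0)
```

Shrinking the window can only lower the expected count, so the first value bounds the rareness from below and the second from above.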

The framework presented here can still be used to cal- 
culate the true value of rareness, and hence tension shown 
by a cluster, but it would be necessary to consider the 
true probability of having selected a cluster in any region 
of mass-redshift space, either through an analytic selection 
function or comparison with simulated mock catalogues (as 
performed in Stalder et al. 2012). 



6 DISCUSSION AND CONCLUSIONS 

In this paper we have considered an unbiased, consistent 
treatment of rare galaxy clusters. Because previous consid- 
erations of cluster rareness have frequently fallen foul of 
biased statistics and overestimated the amount of tension 
a given observation is in with the ΛCDM theory, we have 
been careful in defining the probabilities we are calculating 
to avoid a posteriori effects. We have emphasised that, even 
if a particular cluster observation has a low probability of 
occurring, this does not necessarily imply a high amount of 
tension with the cosmological model as many other potential 
observations would have been regarded as just as unusual. 
We first defined a measure of the 'extremeness' of a cluster. 
This was $m^{0}_{\rm eq}$, the equivalent mass at redshift zero. We did 
this by considering three physically motivated properties of 
a cluster which may be sensitive to modifications in the cosmology:
the number of clusters at greater mass and redshift; 
the peak height $\nu$ in the CDM over-density from which the 
cluster grew; and the number of clusters with a greater mass 
per unit volume. $m^{0}_{\rm eq}$ is then the mass of the notional cluster 
at $z = 0$ which has the same value of these properties. The 
value of $m^{0}_{\rm eq}$ is an intrinsic property of each cluster and does 
not depend on the survey in which it was found, meaning 
it is an ideal proxy for categorising and ranking clusters ac- 
cording to their degree of unusualness or 'extremeness'. 

After this ranking step, we turn to the matter of deciding
if any of these calculated $m^{0}_{\rm eq}$ masses are so large 
as to cause us to doubt that they were produced by the 
concordance ΛCDM model. In order to do this we calculate 
the probability that a defined survey would observe such 
a cluster anywhere in the mass-redshift plane, the absolute 
rareness of a cluster. This is a crucial difference from most 
earlier methods, wherein only clusters which had greater 
mass and redshift were considered as more extreme than 
the one which had been observed. 

We have also considered, by convolving our measure of 
cluster rareness with the halo mass function, where in the 
mass-redshift plane the most unusual clusters in a survey 
are most likely to reside. This provided us with an interesting
result: for the $\nu$ measure, which is sensitive to primordial
non-Gaussianity, the most unusual cluster is uniformly 
found at the highest redshift available to the survey, mean- 
ing that, in principle, higher-redshift objects (i.e. quasars, 
lensing arcs or Gamma-Ray-Bursts as opposed to galaxy 
clusters) are potentially the more sensitive probes of non- 
Gaussianity in large scale structure. All the methods we 
have presented here are immediately generalisable to any 
isolated and collapsed halo (i.e. not just galaxy clusters). 
Our methods would even allow for comparisons and rank- 
ings of "extremeness" and "rareness" to be made between 
different types of extreme haloes. 

Finally, we have conducted a systematic review of 
cluster mass estimations in the literature. Using conservative 
approximations to survey selection functions and an 'at face 
value' approach to published error estimates, we have cal- 
culated the expected rareness for each cluster, finding that 
none are rarer than the rarest cluster expected in some 35% 
of ΛCDM universes. 

To facilitate future unbiased estimates of galaxy 
cluster rareness we have made a numerical code 
available at: http://www.mv.helsinki.fi/home/hotchkis/rareness/. 
This code will calculate $m^{0}_{\rm eq}$, $R$ and a set of exclusion
curves for any sets of clusters that are subsequently 
observed. 
Table 1. The 10 clusters with highest $m^{0}_{\rm eq}$ for the $\nu$ method described in the text. Where multiple observations have been made, the mass 
estimate with the smallest error was used. Cluster masses marked with a * are Uniform Prior masses and were corrected as described in 
section 5.1 to calculate $m^{0}_{\rm eq}$ and rareness. 

| Cluster | $m_{200m}$ / $h^{-1}M_{\odot}$ | $z$ | Mass Reference (Proxy) | $m^{0}_{\rm eq}$ / $h^{-1}M_{\odot}$ | Selection | Rareness |
|---|---|---|---|---|---|---|
| ACT-CLJ0102-4915 | $1.51 \pm 0.22 \times 10^{15}$ | 0.87 | Menanteau et al. (2011) (Combined) | $6.55 \times 10^{15}$ | ACT | 0.49 |
| SPT-CLJ2106-5844 | $8.93 \pm 1.48 \times 10^{14}$ | 1.13 | Foley et al. (2011) (Combined) | $6.01 \times 10^{15}$ | SPT | 0.85 |
| SPT-CLJ0205-5829 | $6.17 \pm 0.96 \times 10^{14}$ | 1.32 | Stalder et al. (2012) (Combined) | $5.77 \times 10^{15}$ | SPT | 0.90 |
| XMMUJ2235-2557 | $5.58^{+\ldots}_{-\ldots} \times 10^{14}$* | 1.39 | Jee et al. (2011) (WL) | $5.02 \times 10^{15}$ | XMM | 0.35 |
| MACSJ0417.5-1154 | $2.86^{+\ldots}_{-\ldots} \times 10^{15}$* | 0.44 | Piffaretti et al. (2011) ($L_X$) | $4.96 \times 10^{15}$ | MACS | 0.96 |
| CLJ1226+3332 | $1.12^{+\ldots}_{-\ldots} \times 10^{15}$* | 0.89 | Jee et al. (2011) (WL) | $4.55 \times 10^{15}$ | WARPS | > 0.99 |
| PLCKG266.6-27.3 | $8.99 \pm 0.67 \times 10^{14}$* | 0.94 | Planck Collaboration et al. (2011) ($Y_X$) | $4.49 \times 10^{15}$ | Planck | > 0.99 |
| MACSJ2243.3-0935 | $2.23^{+\ldots}_{-\ldots} \times 10^{15}$* | 0.45 | Piffaretti et al. (2011) ($L_X$) | $4.07 \times 10^{15}$ | MACS | 0.99 |
| RDCSJ1252-2927 | $5.27^{+\ldots}_{-\ldots} \times 10^{14}$* | 1.24 | Jee et al. (2011) (WL) | $4.06 \times 10^{15}$ | RDCS | > 0.99 |
| MACSJ2211.7-0349 | $2.36^{+\ldots}_{-\ldots} \times 10^{15}$* | 0.40 | Piffaretti et al. (2011) ($L_X$) | $3.95 \times 10^{15}$ | MACS | > 0.99 |



Table 2. The 10 clusters with highest $m^{0}_{\rm eq}$ for the $> m > z$ method described in the text. Where multiple observations have been made, 
the mass estimate with the smallest error was used. Cluster masses marked with a * are Uniform Prior masses and were corrected as 
described in section 5.1 to calculate $m^{0}_{\rm eq}$ and rareness. 

| Cluster | $m_{200m}$ / $h^{-1}M_{\odot}$ | $z$ | Mass Reference (Proxy) | $m^{0}_{\rm eq}$ / $h^{-1}M_{\odot}$ | Selection | Rareness |
|---|---|---|---|---|---|---|
| ACT-CLJ0102-4915 | $1.51 \pm 0.22 \times 10^{15}$ | 0.87 | Menanteau et al. (2011) (Combined) | $3.42 \times 10^{15}$ | ACT | 0.48 |
| MACSJ0417.5-1154 | $2.86^{+\ldots}_{-\ldots} \times 10^{15}$* | 0.44 | Piffaretti et al. (2011) ($L_X$) | $3.10 \times 10^{15}$ | MACS | 0.95 |
| SPT-CLJ2106-5844 | $8.93 \pm 1.48 \times 10^{14}$ | 1.13 | Foley et al. (2011) (Combined) | $2.75 \times 10^{15}$ | SPT | 0.92 |
| MACSJ2211.7-0349 | $2.36^{+\ldots}_{-\ldots} \times 10^{15}$* | 0.40 | Piffaretti et al. (2011) ($L_X$) | $2.47 \times 10^{15}$ | MACS | > 0.99 |
| MACSJ2243.3-0935 | $2.23^{+\ldots}_{-\ldots} \times 10^{15}$* | 0.45 | Piffaretti et al. (2011) ($L_X$) | $2.46 \times 10^{15}$ | MACS | 0.99 |
| SPT-CLJ0205-5829 | $6.17 \pm 0.96 \times 10^{14}$ | 1.32 | Stalder et al. (2012) (Combined) | $2.29 \times 10^{15}$ | SPT | 0.98 |
| MACSJ0308.9+2645 | $2.15^{+\ldots}_{-\ldots} \times 10^{15}$* | 0.36 | Piffaretti et al. (2011) ($L_X$) | $2.19 \times 10^{15}$ | MACS | > 0.99 |
| CLJ1226+3332 | $1.12^{+\ldots}_{-\ldots} \times 10^{15}$* | 0.89 | Jee et al. (2011) (WL) | $2.13 \times 10^{15}$ | WARPS | > 0.99 |
| A1835 | $2.16^{+\ldots}_{-\ldots} \times 10^{15}$* | 0.25 | Okabe et al. (2010) (WL) | $2.05 \times 10^{15}$ | LoCuSS | > 0.99 |
| ACT-CLJ0658-5557 | $1.86 \pm 1.53 \times 10^{15}$ | 0.30 | Marriage et al. (2011) ($L_X$) | $2.03 \times 10^{15}$ | ACT | > 0.99 |
… the mass estimate with the smallest error was used. Cluster masses marked with a * are Uniform Prior masses and were corrected as … 

| Cluster | $m_{200m}$ / $h^{-1}M_{\odot}$ | $z$ | Mass Reference (Proxy) | $m^{0}_{\rm eq}$ / $h^{-1}M_{\odot}$ | Selection | Rareness |
|---|---|---|---|---|---|---|
| ACT-CLJ0102-4915 | $1.51 \pm 0.22 \times 10^{15}$ | 0.87 | Menanteau et al. (2011) (Combined) | … | … | … |
APPENDIX A: CLUSTER CATALOGUE 

Table A1 shows the list of papers used to construct our 
cluster catalogue. In total 2240 cluster mass estimations 
were included, where measurements in multiple proxies were 
allowed. As mentioned in the text, the mass uncertainties on 
each method were taken to be those given by each paper and 
were assumed to be normally distributed where error regions 
were symmetric and log-normally distributed when asymmetric.
For the MCXC catalogue (Piffaretti et al. 2011), 
where no error estimates are given, a log-normal error distribution
with $\sigma_{\ln m} = 0.2$ was assumed, as is fairly typical 
for X-ray observations of clusters. 
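
The two error models can be written down in a few lines. In this sketch the quoted mass is treated as the median of the log-normal distribution, which is an assumption made here for illustration; the function names are hypothetical, and the example mass is the MCXC-based value listed for MACSJ0417.5-1154 in Table 1.

```python
import math
from statistics import NormalDist

def normal_cdf(m, m_quoted, sigma_m):
    """CDF of a normal mass distribution (used for symmetric error bars)."""
    return NormalDist(m_quoted, sigma_m).cdf(m)

def lognormal_cdf(m, m_quoted, sigma_ln_m=0.2):
    """CDF of a log-normal mass distribution: ln(m) is normally distributed.

    Assumption: the quoted mass is taken as the median of the distribution.
    """
    return NormalDist(math.log(m_quoted), sigma_ln_m).cdf(math.log(m))

# MCXC-style case: no quoted error, so sigma_ln_m = 0.2 is assumed.
m_quoted = 2.86e15
m_low = m_quoted * math.exp(-0.2)
m_high = m_quoted * math.exp(+0.2)
fraction_inside = lognormal_cdf(m_high, m_quoted) - lognormal_cdf(m_low, m_quoted)
```

The interval from `m_low` to `m_high` contains about 68% of the probability and is wider above the quoted mass than below it, which is why a log-normal form suits asymmetric error regions.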

All cluster masses are converted to $m_{200m}$ (the mass 
within the cluster region whose mean density is 200 times the average 
density of the Universe) assuming an NFW halo profile, with a 
single concentration parameter $c$, which is calculated using 
the concentration-mass relation of Duffy et al. (2008) and 
WMAP7+BAO+H0 ML parameters from Komatsu et al. 
(2011). 
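
A conversion of this kind can be sketched as follows for a single NFW halo. The sketch takes the input concentration as given rather than evaluating the Duffy et al. (2008) relation, and the example mass, concentration and matter density parameter are hypothetical values chosen for illustration; the function names are likewise assumptions.

```python
import math

def nfw_enclosed(x):
    """Dimensionless NFW enclosed mass, ln(1 + x) - x / (1 + x), with x = r / r_s."""
    return math.log1p(x) - x / (1.0 + x)

def convert_nfw_mass(m_in, c_in, density_ratio):
    """Convert an NFW mass between two overdensity definitions.

    density_ratio = (Delta_out * rho_ref_out) / (Delta_in * rho_ref_in).
    Returns (m_out, c_out), where c_out = r_out / r_s.
    """
    # The mean enclosed density scales as nfw_enclosed(x) / x**3, which
    # decreases monotonically with x, so a bisection in log(x) is safe.
    target = density_ratio * nfw_enclosed(c_in) / c_in ** 3
    x_lo, x_hi = 1.0e-3, 1.0e3
    for _ in range(200):
        x_mid = math.sqrt(x_lo * x_hi)
        if nfw_enclosed(x_mid) / x_mid ** 3 > target:
            x_lo = x_mid  # still too dense: move outwards
        else:
            x_hi = x_mid
    c_out = math.sqrt(x_lo * x_hi)
    m_out = m_in * nfw_enclosed(c_out) / nfw_enclosed(c_in)
    return m_out, c_out

# Example: 200 x critical density -> 200 x mean density at z = 0, assuming
# Omega_m = 0.27 so that rho_mean / rho_crit = 0.27 (illustrative value).
m_200c, c_200c = 1.0e15, 5.0
m_200m, c_200m = convert_nfw_mass(m_200c, c_200c, density_ratio=0.27)
```

Because the mean density of the Universe is lower than the critical density, the radius enclosing 200 times the mean density is larger, and the converted mass exceeds the input mass.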

Selection functions were defined for clusters in order to 
provide upper limits on the tension with ΛCDM (i.e. lower 
limits on $R$). As such, we estimated the portion of the mass-redshift
plane where a survey was expected to be complete 
(i.e. guaranteed to observe all clusters in that region), using 
the information available from the literature on the surveys. 
The selection functions used to calculate the rareness values 
in Tables 1-3 are detailed below. 



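| Survey | $A$ / deg$^2$ | $m_{\rm min}$ / $h^{-1}M_{\odot}$ | $z_{\rm min}$ | $z_{\rm max}$ |
|---|---|---|---|---|
| ACT | 755 | $8 \times 10^{14}$ | 0.3 | 6.0 |
| SPT | 2500 | $8 \times 10^{14}$ | 0.3 | 6.0 |
| XMM | 80 | $4 \times 10^{14}$ | 0.9 | 1.5 |
| MACS | 22735 | $8 \times 10^{14}$ | 0.3 | 0.7 |
| WARPS | 72 | $8 \times 10^{14}$ | 0.0 | 0.6 |
| PLCK | 41253 | $1 \times 10^{\ldots}$ | 0.3 | 6.0 |
| RDCS | 50 | $8 \times 10^{14}$ | 0.05 | 0.8 |
| LoCuSS | 32085 | $3 \times 10^{14}$ | 0.15 | 0.3 |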
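
The survey areas listed above translate into sky fractions by dividing by the area of the full sky in square degrees. The short sketch below does this for the tabulated areas; the dictionary and variable names are hypothetical, and capping the fraction at 1 is a choice made here to absorb rounding in the tabulated all-sky area.

```python
import math

# Area of the full sky: 4*pi steradians expressed in square degrees.
FULL_SKY_DEG2 = 4.0 * math.pi * (180.0 / math.pi) ** 2  # about 41252.96

# Survey areas in square degrees, copied from the selection-function table.
survey_area_deg2 = {
    "ACT": 755, "SPT": 2500, "XMM": 80, "MACS": 22735,
    "WARPS": 72, "PLCK": 41253, "RDCS": 50, "LoCuSS": 32085,
}

# Sky fraction for each survey, capped at 1 for the all-sky case.
f_sky = {name: min(area / FULL_SKY_DEG2, 1.0)
         for name, area in survey_area_deg2.items()}
```

For example, the 80 square degrees listed for XMM correspond to a sky fraction of roughly 0.002, while the area listed for PLCK is the whole sky.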



We consider the list to be a comprehensive sample of 
observations of the masses of extreme clusters, but would 
gladly accept suggestions of catalogues which could expand 
it. 
Table A1. Papers used to compile cluster catalogue. $N_{\rm cl}$ is the number of clusters contained within each paper and observable mass 
proxies are WL, Weak Lensing; $N_{200}$, cluster richness; $T_X$, X-ray gas temperature; $Y_X$, integrated X-ray flux; $Y_{SZ}$, integrated Compton-$y$; 
$\sigma_v$, velocity dispersion; $L_X$, X-ray luminosity and $\zeta$, SPT matched filter signal-to-noise. 

| Reference | Selecting Survey (Proxy) | $N_{\rm cl}$ | Mass Proxy | Notes |
|---|---|---|---|---|
| McInnes et al. (2009) | SPT (SZ) | 3 | WL | N/A |
| High et al. (2010) | SPT (SZ) | 21 | … | N/A |
| Suhada et al. (2010) | SPT (SZ) | 2 | $T_X$ | N/A |
| Andersson et al. (2011) | SPT (SZ) | 15 | $T_X$, $Y_X$ | N/A |
| Brodwin et al. (2010) | SPT (SZ) | 1 | … | SPT-CLJ0546-5345 |
| Marriage et al. (2011) | ACT (SZ) | 23 | $L_X$ | N/A |
| Foley et al. (2011) | SPT (SZ) | 1 | $\zeta$, $Y_X$, $T_X$, $\sigma_v$ | SPT-CLJ2106-5844 |
| Stalder et al. (2012) | SPT (SZ) | 1 | $\zeta$, $T_X$ | SPT-CLJ0205-5829 |
| Planck Collaboration et al. (2011) | Planck (SZ) | 10 | $Y_X$ | N/A |
| Reichardt et al. (2012) | SPT (SZ) | 224 | $\zeta$ | N/A |
| High et al. (2012) | SPT (SZ) | 5 | WL | N/A |
| Jee et al. (2009b) | XMMU (X-ray) | 1 | WL | XMMU J2235.3-2557 |
| Rosati et al. (2009) | XMMU (X-ray) | 1 | $T_X$ | XMMU J2235.3-2557 |
| Gobat et al. (2011) | XMMU (X-ray) | 1 | $L_X$ | Highest $z$ |
| Piffaretti et al. (2011) | Multiple (X-ray) | 1743 | $L_X$ | MCXC |
| Fassbender et al. (2011) | XDCP (X-ray) | 22 | $L_X$ | N/A |
| Jee et al. (2011) | Multiple (X-ray) | 22 | WL | All at $z > 1$ |
| Suhada et al. (2012) | XMM-BCS (X-ray) | 46 | $L_X$ | N/A |
| Okabe et al. (2010) | LoCuSS (WL) | 30 | WL | N/A |
| Demarco et al. (2010) | SpARCS (Optical) | 3 | $\sigma_v$ | N/A |
| Menanteau et al. (2010) | SCSO (Optical) | 105 | $N_{200}$ | N/A |
| Brodwin et al. (2012) | IDCS (IR) | 1 | $Y_{SZ}$, $L_X$ | N/A |
| Vulcani et al. (2012) | EDisCS (Optical) | 1 | WL | N/A |